Information on group members:
%matplotlib inline
import csv
import matplotlib.pyplot as plt
from colour import Color
1.1) Get acquainted with the below "City" class definition. It has three major fields: name, city's longitude, and latitude. The coordinate's input is in the format "00°00'E/N," and it is converted to a list [degrees, minutes]. Also, these coordinates are also simply converted and stored as x and y coordinates.
class City:
def __init__(self, name, long, lat):
self.name = name
self.long = [float(long.split("°")[0]), float(long.split("°")[1].replace("'",'').replace("E",''))]
self.x = self.long[0]*60 + self.long[1]
self.lat = [float(lat.split("°")[0]), float(lat.split("°")[1].replace("'",'').replace("N",''))]
self.y = self.lat[0]*60 + self.lat[1]
def __repr__(self):
return str(self.name) + " " + str(self.lat) + " " + str(self.long)
1.2) The below piece of code loads cities data from the cities.csv file and creates a list of cities.
cities = []
with open('cities.csv', newline='', encoding="utf-8") as csvfile:
r = csv.reader(csvfile, delimiter=',', quotechar='|')
for row in r:
cities.append(City(row[0], row[1], row[2]))
for c in cities[:10]: ### TEST
print(c)
Adamów (siedleckie) [51.0, 45.0] [22.0, 15.0] Adamów (zamojskie) [50.0, 36.0] [23.0, 10.0] Adamówka [50.0, 16.0] [22.0, 42.0] Aleksandrów [51.0, 16.0] [19.0, 59.0] Aleksandrów Kujawski [52.0, 53.0] [18.0, 42.0] Aleksandrów Łódzki [51.0, 49.0] [19.0, 19.0] Alwernia [50.0, 4.0] [19.0, 32.0] Andrespol [51.0, 44.0] [19.0, 37.0] Andrychów [49.0, 52.0] [19.0, 20.0] Andrzejewo [52.0, 50.0] [22.0, 12.0]
1.3) The function below plots all cities according to their X and Y coordinates. Note that there are so many cities that Poland's boundaries can be observed. Feel free to modify the code. The unrequired input "edges" is a list of pairs of indices indicating edges to be drawn.
### TODO DEL GRADIENT
def plotMap(cities, edges=[], colour_change=False, size = (6,6)):
fig=plt.figure(figsize=size, dpi= 100, facecolor='w', edgecolor='k')
if colour_change:
colors = [color.hex for color in list(Color("red").range_to(Color("green"), len(cities)))]
else:
colors = ["red"] * len(edges)
for i, e in enumerate(edges):
color = colors[i] if colour_change else "red"
X = [cities[e[0]].x, cities[e[1]].x]
Y = [cities[e[0]].y, cities[e[1]].y]
plt.plot(X, Y, lw=1, ls="-", marker="", color=color)
X = [c.x for c in cities]
Y = [c.y for c in cities]
plt.plot(X, Y, ms=2, ls="", marker="o")
plt.show()
plotMap(cities, edges=[[1,200]])
1.3) The dataset consists of 2249 cities. It may be too much when it comes to implementing the algorithm for constructing the minimum spanning tree (computational burden). Therefore, you can consider only every n-th row from the dataset to ensure that the computations will be completed in a reasonable time.
n = 3
cities_small = cities[2::n]
plotMap(cities_small, edges=[[0,200]])
1.4) Finish the function below for constructing a distance matrix. Each i-th row should contain distances from the i-th city to other cities. For convenience and to improve the algorithm's computational complexity for constructing the spanning tree, it is suggested to keep rows sorted according to distance. Rows may contain information on the j-th indices of the respective destinations along with the distances in a tuple (j-th indice, distance).
You can use the Euclidean distance for computations. You can use the x and .y fields that store the geographical coordinates. This way of formulating a distance does not have much sense because the computation would be based on angles. Nonetheless, it is allowed in this exercise to do so, but if you wish, you can calculate the distances correctly by computing arc lengths between two points on a sphere - Earth.
Toruń = next(c for c in cities if c.name == "Toruń")
print(Toruń)
Bydgoszcz = next(c for c in cities if c.name == "Bydgoszcz")
print(Bydgoszcz)
Toruń [53.0, 2.0] [18.0, 37.0] Bydgoszcz [53.0, 7.0] [18.0, 0.0]
Toruń.x
1117.0
Toruń.long
[18.0, 37.0]
Toruń.lat[0] + Toruń.lat[1]
55.0
import math
def haversine_distance(city1, city2):
# Radius of the Earth in kilometers
R = 6371.0
# Convert latitude and longitude from degrees to radians
lat1 = math.radians(city1.y / 60)
lon1 = math.radians(city1.x / 60)
lat2 = math.radians(city2.y / 60)
lon2 = math.radians(city2.x / 60)
# Haversine formula
dlon = lon2 - lon1
dlat = lat2 - lat1
a = math.sin(dlat / 2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2)**2
c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
# Distance in kilometers
distance = R * c
return distance
# Example usage
city1 = next(c for c in cities if c.name == "Toruń")
city2 = next(c for c in cities if c.name == "Bydgoszcz")
distance = haversine_distance(city1, city2)
print(f"Distance between {city1.name} and {city2.name}: {distance:.2f} km")
Distance between Toruń and Bydgoszcz: 42.22 km
def euclidean_distance(city1, city2):
return math.sqrt((city1.x - city2.x)**2 + (city1.y - city2.y)**2)
# Example usage
city1 = next(c for c in cities if c.name == "Toruń")
city2 = next(c for c in cities if c.name == "Bydgoszcz")
distance = euclidean_distance(city1, city2)
print(f"Euclidean distance between {city1.name} and {city2.name}: {distance:.2f}")
Euclidean distance between Toruń and Bydgoszcz: 37.34
def getSortedDistances(cities):
distances = []
for i, city1 in enumerate(cities):
city1_distances = []
for j, city2 in enumerate(cities):
if i != j:
distance = haversine_distance(city1, city2)
city1_distances.append((j, distance))
city1_distances.sort(key=lambda x: x[1])
distances.append(city1_distances)
return distances
D = getSortedDistances(cities_small)
print(D[0][:5])
[(550, 11.00036580917997), (287, 16.846146926412768), (27, 17.33723228826242), (474, 17.557950703065398), (300, 18.95399028178398)]
1.5) Auxiliary function for finding a corresponding index in the data set for a provided city name.
def getIdx(name, cities):
for i, c in enumerate(cities):
if name == c.name: return i
return None
print(getIdx("Hel", cities_small))
178
1.6) Complete the below greedy algorithm for constructing the minimum spanning tree. As an input, it should return a list of edges of the constructed subgraph, which can then be used in the plotMap function to show the tree structure. You can set the city of Hel as an initial node.
def getSpanningTree(stratingNode, cities, D):
N = len(cities)
IN_TREE = set([stratingNode])
edges = []
while len(IN_TREE) < N: #we chceck if we have all cities in the tree
min_edge = None # here we will store the edge with the smallest distance
min_distance = float('inf') #minimal distance
for u in IN_TREE: #for each city in the tree
for v, dist in D[u]: #for each city connected with city u that is already t=in the tree
if v not in IN_TREE and dist < min_distance: #if city v is not in the tree and the distance is smaller than the smallest distance
min_edge = (u, v) #we store the edge
min_distance = dist #we store the distance
if min_edge: #if we found the edge that is not None
edges.append(min_edge) #we add the edge to the list of edges
IN_TREE.add(min_edge[1]) #we add the city to the tree
return edges
edges1 = getSpanningTree(getIdx("Hel", cities_small), cities_small, D)
plotMap(cities_small, edges=edges1)
plotMap(cities_small, edges=edges1, size=(10,10))
1.6) Modify the code so that it displays the results after 10 intermediate steps. In this way, you can get a better insight into how the algorithm constructs the spanning tree. Additionally, you can use, e.g., a gradient to distinguish between different iterations of the algorithm when plotting the map (e.g., 1-st iteration = 100% red, last iteration == 100% green).
def getSpanningTree(stratingNode, cities, D):
N = len(cities)
IN_TREE = set([stratingNode])
edges = []
step = 0
while len(IN_TREE) < N: # we check if we have all cities in the tree
min_edge = None # here we will store the edge with the smallest distance
min_distance = float('inf') # minimal distance
for u in IN_TREE: # for each city in the tree
for v, dist in D[u]: # for each city connected with city u that is already in the tree
if v not in IN_TREE and dist < min_distance: # if city v is not in the tree and the distance is smaller than the smallest distance
min_edge = (u, v) # we store the edge
min_distance = dist # we store the distance
if min_edge: # if we found the edge that is not None
edges.append(min_edge) # we add the edge to the list of edges
IN_TREE.add(min_edge[1]) # we add the city to the tree
step += 1 #incrementing the step size after each iteration- adding the edge so one city more in the tree
if step % 10 == 0 or len(IN_TREE) == N: # we plot the map every 10 steps or when we have all cities in the tree
plotMap(cities, edges, colour_change=True)
return edges
edges1 = getSpanningTree(getIdx("Hel", cities_small), cities_small, D)
# plotMap(cities_small, edges=edges1, colour_change=True)
print("Final minimal spanning tree of small cities:")
plotMap(cities_small, edges=edges1, colour_change=True, size =(10,10))
Final minimal spanning tree of small cities: